Abstract. Since language is tied to cognition, we expect linguistic structures to
reflect patterns that we encounter in nature and that are analyzed by physics. Within this
realm we investigate the process of protolanguage acquisition, using analytical and tractable
methods developed within physics. A protolanguage is a mapping between sounds
and objects (or concepts) of the perceived world. This mapping is represented by
a matrix, and the linguistic interaction among individuals is described by a random
matrix model. There are two essential parameters in our approach: the strength
of the linguistic interaction $\beta$, which, following Chomsky's tradition, we consider a
genetically determined ability, and the number $N$ of employed sounds (the lexicon size).
Our model of linguistic interaction is studied analytically using methods of statistical
physics and simulated by Monte Carlo techniques. The analysis reveals an intricate
relationship between the innate propensity for language acquisition $\beta$ and the lexicon
size $N$, $N \simeq \exp(\beta)$. Thus a small increase of the genetically determined $\beta$ may lead
to an incredible lexical explosion. Our approximate scheme offers an explanation for
the biological affinity of different species and their simultaneous linguistic disparity.
1. Introduction

The emergence of language was a defining moment in human evolution. Language first
appeared about 100000 years ago, in an evolutionary eye-blink, in the species Homo sapiens.
Its sudden emergence and spread, like a viral epidemic, is hard to
explain in terms of standard evolution, which is why the evolution of
language has been called the "hardest problem of science" [1], [2], [3], [4]. Language allowed
effective communication among the members of a human group, helped in transferring
information from one generation to another, and even served as a systematic method
to interpret the world, creating an endless semiotic process. The linguistic system is
a highly generative system [5]. A few phonemes form a large number of words. Words,
following relatively few basic "rules of composition" (or a syntax), form an infinity of
phrases and sentences. Thus, language enables us to transfer unlimited information.
This limitlessness has been described as "making infinite use of finite means" [6], [7].

Biology uses another exemplary generative system. Genomes consist of an alphabet 
of four nucleotides, which together with certain rules for how to produce proteins and 
organize cells, generates an unlimited variety of living organisms. Noam Chomsky, 
who revolutionized linguistic research, emphasized that the human faculty of language 
appears to be organized like the biological genetic code - hierarchical, generative, 
recursive, and virtually limitless with respect to its scope of expression. Our ability to 
understand and utter language is due to a universal grammar that is somehow hardwired 
within us [8]. Language develops just like any other organ in the human body: an innate
program, founded in a "linguistic genotype", supports linguistic growth, though the final
"linguistic phenotype" is conditioned by experience. With these ideas in mind, one might
wonder why our genetically closest relatives did not develop something
akin to language. Or, as Darwin already put it in his "Origin of Species" [9]:

"not one author posed the question as to why in some animals the cognitive 
capabilities are developed more than in others, whereas such development should have 
been useful for all? Why monkeys did not acquire human intellectual capabilities?" 

In the present paper, we would like to draw attention to the oldest generative
system, the physical world itself, and to its potential relevance for the language
phenomenon. Despite its immense variety, nature can be analyzed and understood as
a collection of a few building blocks, the elementary particles (quarks, leptons, gauge
particles). The elementary particles interact and form (or are transformed into) larger
compounds (nuclei, molecules, galaxies) via the four well-known interactions. We
may view the elementary particles as constituting an "alphabet", and the interactions
as providing the "rules of composition" (or "grammatical rules") to create the larger
configurations. Within this analogy, it is rather significant that the ancient
Greeks used the same word (στοιχεῖα) to denote both the letters of the alphabet
and the constitutive elements of the universe. Language cannot be separated from
cognition, which reproduces the world. Linguistic devices expressing quantity, tense,
comparison, temporal or logical relations, embody patterns encountered in nature. In an
intense semiotic process, we constantly create mappings and analogies, and sculpt outputs to
match the external inputs. Nature then is reflected in our language and we dissect nature
along lines laid down by language. This profound analogy between nature and human language [10]
prompts us to use ideas and techniques encountered in physical theories, in order to
study aspects of the linguistic dynamics.

As a first step in this approach, we consider the learning of a protolanguage,
employing a dynamical scheme inspired by the random matrix approach and statistical
physics. A protolanguage is a mapping between sounds and objects (or concepts) of the
perceived world. A protolanguage may be represented by an association matrix, and a
population of individuals (humans or other animals) uses a
specific association matrix for communication [11], [12]. Another individual (a newcomer, or a newborn) may
use a different association matrix, selected randomly among the possible languages. We
then expect the interaction of the single individual with the population to lead to
a "realignment" of her (his) linguistic expression with the language of the community.
Our model simulates this process as a matrix-matrix interaction, and the equilibrium
reached is analyzed using the methods of statistical physics.

There is already significant interdisciplinary research on the evolutionary aspects
of human language. Such interest is a direct consequence of the rapid advances in
the field of complexity [13]. Complex systems comprising many interacting units are
studied using the principles of statistical physics, even though the interacting units are
no longer atoms as in traditional physics applications, but biological species [14], [15],
human beings [16], [17], or financial markets [18], [19]. Human language, which
traditionally was viewed as a rather qualitative subject of study, fits adequately in
the above dynamical framework. The study of language, inspired by evolutionary
dynamics, has been rigorously explored by Nowak and colleagues [3], [5]. The areas of study
range from linguistic games [20] and language competition between two [21], [22], [23], [24]
or more languages [25], [26], [27], [28], [29], [30], [31], [32] to the quantification of language
characteristics and their explanation from first principles [33], [34]. The mathematical
framework of language modeling and simulation has already given some rather intriguing
results. Abrams and Strogatz [35] have proposed a simple model of nonlinear differential
equations which describes rather well the distribution of spoken languages, and several
extensions of this model have been subsequently studied [21], [36]. Several agent-based
models of language competition have been proposed [25], [33], [21] and the probability
distribution of spoken languages has been described with considerable accuracy [38], [39].
Recently, there has been an interesting attempt at a systematic study of the influence of
geography [21] on language competition, an original attempt to describe linguistic
aspects in terms of random matrices [39], and a study of the network properties of
written human languages [40].

In section 2, we present in detail our model, including analytic approximations and 
Monte Carlo simulations. In section 3 we present the main results and discuss their 
importance. Our conclusions and directions for future work are presented in section 4. 
2. Model and Methods

2.1. Analytical description

We imagine a group of individuals who have established a simple communication
system by using sounds to encode meaning. Suppose that we have $N$ objects and
each object is denoted by a distinct sound (a total of $N$ sounds). The mapping
of objects to sounds is specified by an $N \times N$ active matrix $P$, whose elements are either
one or zero. For example, the entry $p_{ij} = 1$ implies that the object $i$ is associated with
the sound $j$: every time a speaker wishes to refer to object $i$, he uses the sound $j$.
Next to the mapping from objects to sounds, there is another mapping from sounds back
to objects, specified by the $N \times N$ passive matrix $Q$. Here the entry $q_{nm} = 1$ implies
that a listener hearing sound $n$ will infer object $m$. Language involves both speaking
and listening, and the linguistic code of an individual, $L(P, Q)$, is defined by these two
matrices [11], [12]. The maximum effectiveness of communication is
achieved when the matrices $P$ and $Q$ are connected by

$$p_{ij} = q_{ji} \qquad (1)$$

How many linguistic codes may we have? Matrix $P$, as well as $Q$, is constructed as a
permutation matrix; that is, there is a single entry equal to one per row and column,
all other entries being zero. There are $N!$ possible ways to associate $N$ objects with $N$
sounds and therefore $N!$ distinct linguistic codes. An established community, advancing
through sharing and exchanging information, uses the same language $L(P, Q)$. An
individual who is not a member of the community (a newcomer, or a newborn) might be using
another language $L'(P', Q')$, chosen randomly among the $N!$ possibilities. Some of the
object-sound (or sound-object) associations might be the same in both languages $L$ and
$L'$, while others may be different.
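
To make the counting concrete, the following minimal Python sketch builds the active and passive matrices of one randomly chosen code and checks that both are permutation matrices. The helper names `code_matrices` and `is_permutation_matrix`, the 0-based indexing and the choice N = 4 are illustrative assumptions; they are not part of the model itself.

```python
import math
import random

def code_matrices(perm):
    """Build the active matrix P and passive matrix Q for the code in which
    object i is named by sound perm[i] (0-based indices, an assumption)."""
    n = len(perm)
    P = [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]
    # Eq. (1): q_ji = p_ij, i.e. Q is the transpose of P.
    Q = [[P[i][j] for i in range(n)] for j in range(n)]
    return P, Q

def is_permutation_matrix(M):
    """True if M has only 0/1 entries and exactly one 1 per row and column."""
    n = len(M)
    entries_ok = all(x in (0, 1) for row in M for x in row)
    rows_ok = all(sum(row) == 1 for row in M)
    cols_ok = all(sum(M[i][j] for i in range(n)) == 1 for j in range(n))
    return entries_ok and rows_ok and cols_ok

N = 4
perm = random.sample(range(N), N)   # one code drawn uniformly at random
P, Q = code_matrices(perm)
n_codes = math.factorial(N)         # number of distinct codes, N!
print(is_permutation_matrix(P), is_permutation_matrix(Q), n_codes)
```

Storing a code as the permutation `perm` rather than as a full matrix is usually more convenient; the matrices are shown only to mirror the notation of the text.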

The interaction between an individual $A$ using language $L$ and an individual $A'$ using
language $L'$ is quantified [3], [12] by the "communication energy" $E$,

$$E(L, L') = -\frac{1}{2} \sum_{i,j} \left( p_{ij} q'_{ji} + p'_{ij} q_{ji} \right) \qquad (2)$$

$E$ is a direct measure of the communication success, the ability of $A$ to convey
information to $A'$ and vice versa. The first term, $p_{ij} q'_{ji}$, denotes the possibility that
speaker $A$ successfully communicates object $i$ to listener $A'$, while for the second term,
$p'_{ij} q_{ji}$, the speaker-listener relationship is reversed. If the same language is used, taking
into account the condition Eq. (1), we obtain

$$E(L, L) = -N \qquad (3)$$

marking the ideal communication. In general, for two different languages
miscommunication occurs, resulting from the different assignments of objects to sounds.
We expect in general that

$$E(L, L') = -m, \qquad 0 \le m \le N \qquad (4)$$
where $m$ is the number of common semantic associations the two linguistic codes
have. It is expected that the single individual, in continuous interaction with the
surrounding environment, which uses the definite code $L$, will increase the number
of common object-sound associations, thus stepping up the acquisition of the
language $L$. Within our model, this is achieved by giving a higher weight to the
languages with an increased "correct" identification of objects with sounds. Following the
experience from systems in equilibrium, this statistical weight is chosen as $\exp(-\beta E)$.
With $\beta$ we represent the strength of the linguistic interaction. Large values of $\beta$ favor
the "alignment" of the linguistic choices, i.e. codes resembling $L$ are strongly favored.
Low $\beta$ values allow the presence of a variety of languages. Our approach, considering
the linguistic interaction as an intense one leading to equilibrium, suggests that we may
use techniques from statistical physics. We then define the partition function as

$$Z = \sum_{L'} \exp[-\beta E(L, L')] \qquad (5)$$

The summation is carried out over all possible "linguistic states" $L'$, while the code $L$
appears as a constant external field. Taking into account Eq. (4), we obtain

$$Z = \sum_{m} g(m) \exp(\beta m) \qquad (6)$$

where $g(m)$ is the multiplicity of languages sharing $m$ semantic associations with $L$. To
evaluate $g(m)$, we start with the encoding in language $L$ considered as a permutation,
and generate all other permutations keeping $m$ assignments fixed. We may select the $m$
fixed assignments in $\binom{N}{m}$ different ways, while for the other $N - m$ elements the
permutation is a derangement, meaning that none of these elements
appears in its original position. The multiplicity $g(m)$ is then equal to

$$g(m) = \binom{N}{m} D(N - m) \qquad (7)$$

where the number of derangements is given by (the mathematical details may be found
in the appendix):

$$D(k) = k! \sum_{l=0}^{k} \frac{(-1)^l}{l!} \qquad (8)$$

Notice that $D(0) = 1$, $D(1) = 0$, while for large $k$

$$D(k) \simeq \frac{k!}{e} \qquad (9)$$
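
Eqs. (7) to (9) can be checked by brute force for a small lexicon. The sketch below assumes that a code is stored as a permutation (object i maps to sound `perm[i]`) and that Eq. (1) holds, so that $m = -E(L, L')$ is simply the number of matching positions. It enumerates all $N!$ codes for $N = 6$ and compares the histogram of $m$ with $g(m)$. All function names are hypothetical.

```python
from itertools import permutations
from math import comb, exp, factorial

def derangements(k):
    """Eq. (8) in exact integer arithmetic: D(k) = sum_l (-1)^l * k!/l!."""
    return sum((-1) ** l * (factorial(k) // factorial(l)) for l in range(k + 1))

def multiplicity(n, m):
    """Eq. (7): number of codes sharing exactly m associations with L."""
    return comb(n, m) * derangements(n - m)

def shared_associations(code_a, code_b):
    """m = -E(L, L') of Eq. (4), assuming Q is the transpose of P (Eq. (1))."""
    return sum(a == b for a, b in zip(code_a, code_b))

N = 6
reference = tuple(range(N))                  # community code L
counts = [0] * (N + 1)
for candidate in permutations(range(N)):     # every possible code L'
    counts[shared_associations(reference, candidate)] += 1

predicted = [multiplicity(N, m) for m in range(N + 1)]
print(counts == predicted)                          # brute force vs Eq. (7)
print(derangements(12) / factorial(12), exp(-1))    # Eq. (9): ratio close to 1/e
```

The entry for $m = N - 1$ is zero because a single misplaced element is impossible, which is the statement $D(1) = 0$.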

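An accurate expression for the partition function is then obtained:

$$Z \simeq \frac{N!}{e} \sum_{m=0}^{N-2} \frac{e^{m\beta}}{m!} + \exp(\beta N) \qquad (10)$$

Here Eq. (9) is used for all $m \le N - 2$; the term $m = N - 1$ vanishes because $D(1) = 0$,
and the term $m = N$, with $D(0) = 1$, contributes $\exp(\beta N)$.
All measurable quantities concerning our system can be derived from the partition
function. Language acquisition can be measured by studying the average number of
common associations $\langle m \rangle$, given by

$$\langle m \rangle = \frac{\sum_m m\, g(m)\, e^{\beta m}}{\sum_m g(m)\, e^{\beta m}} = \frac{1}{Z} \frac{\partial Z}{\partial \beta} \qquad (11)$$

Fluctuations around the mean value can be estimated as

$$\Delta m^2 = \langle m^2 \rangle - \langle m \rangle^2 = \frac{\partial \langle m \rangle}{\partial \beta} \qquad (12)$$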
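
As a numerical illustration of Eqs. (6), (11) and (12), the sketch below evaluates the partition sum exactly for a modest lexicon size, working with logarithms of the weights to avoid overflow. The function names, the recurrence $D(k) = k\,D(k-1) + (-1)^k$ (which follows from Eq. (8)) and the values $N = 200$, $\beta = 2$ are illustrative choices, not taken from the original analysis.

```python
from math import comb, exp, log

def derangement_table(n):
    """D(0..n) from D(k) = k*D(k-1) + (-1)^k, a consequence of Eq. (8)."""
    table = [1]
    for k in range(1, n + 1):
        table.append(k * table[-1] + (-1) ** k)
    return table

def moments(n, beta):
    """Return (ln Z, <m>, Var(m)) from Eqs. (6), (11) and (12)."""
    d = derangement_table(n)
    # log of g(m) * exp(beta * m); terms with g(m) = 0 (only m = n - 1) are skipped
    logs = [(m, log(comb(n, m) * d[n - m]) + beta * m)
            for m in range(n + 1) if d[n - m] > 0]
    top = max(lw for _, lw in logs)                  # shift for numerical stability
    weights = [(m, exp(lw - top)) for m, lw in logs]
    norm = sum(w for _, w in weights)
    mean = sum(m * w for m, w in weights) / norm
    second = sum(m * m * w for m, w in weights) / norm
    return top + log(norm), mean, second - mean ** 2

N, beta, h = 200, 2.0, 1e-4
ln_z, mean, var = moments(N, beta)
# Central finite difference of <m> with respect to beta, to compare with Var(m)
slope = (moments(N, beta + h)[1] - moments(N, beta - h)[1]) / (2 * h)
print(mean, var, slope, exp(beta))
```

For these parameters the mean, the variance and the finite-difference slope should all lie close to $\exp(\beta)$, since $\beta$ is well below $\ln N$.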

Our simple linguistic system reveals interesting correlations between the interaction
strength $\beta$ and the size of the lexicon $N$. Consider first the case of small $\beta$, where
many terms contribute in Eq. (10). The terms entering the summation build up an
exponential which dominates, and the partition function is then approximated by

$$Z \simeq \frac{N!}{e} \exp[\exp(\beta)] \qquad (\text{small } \beta) \qquad (13)$$

From Eq. (11) we find

$$\langle m \rangle \simeq \exp(\beta) \qquad (\text{small } \beta) \qquad (14)$$

We notice that the number of acquired words increases exponentially with the interaction
strength $\beta$. Another way of stating our result is that a small increase in $\beta$, which may
be considered as an innate propensity for language acquisition, provokes an exponential
growth of the size of the available lexicon. The spread around the average value $\langle m \rangle$ is,
using Eqs. (12) and (13),

$$\Delta m^2 \simeq \exp(\beta) \qquad (\text{small } \beta) \qquad (15)$$

The spread is significant since many linguistic states contribute to the mean value.
For large $\beta$ values the important contributions come from languages very similar
to $L$. In this case

$$Z \simeq e^{\beta N} + \frac{N(N-1)}{2} e^{\beta (N-2)} \qquad (\text{large } \beta) \qquad (16)$$

The mean value $\langle m \rangle$ is

$$\langle m \rangle \simeq N \qquad (\text{large } \beta) \qquad (17)$$

while the spread around the mean value decays exponentially as $\beta$ increases. The
crossover between the two regimes, small vs large $\beta$ values, occurs at

$$\beta_{cr} \simeq \ln N \qquad (18)$$

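The underlying dynamics is manifested when we consider the entropy $S$,

$$S = -\beta \frac{\partial \ln Z}{\partial \beta} + \ln Z \qquad (19)$$

Using Eq. (13) we find

$$S \simeq N \ln N - \beta \exp(\beta) \qquad (20)$$

At small $\beta$ values the entropy is large, reflecting that all $N!$ codes contribute, while
as $\beta$ approaches $\beta_{cr}$ the entropy becomes zero since only one code contributes. Indeed,
the right-hand side of Eq. (20) vanishes at $\beta = \ln N$, in agreement with Eq. (18). The
theoretical analysis is supported by a detailed Monte Carlo simulation.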



A Random Matrix Approach to Language Acquisition 



2.2. Monte Carlo Simulations 

We simulate the learning process described by the above model using the following
algorithm. First, a random permutation of the integers from 1 to $N$ is chosen to
represent the "optimal" language which has to be learned (understood and memorized)
by a learning agent. Then, another permutation is chosen to represent the language of
this learning agent. When the two sequences have the same number at the same position,
this is considered a success, meaning that the learner has correctly identified the
meaning of this word and associated it with the proper object. We compare the two
sequences and count the number $m$ of successes. Then, a random pair of the $N$ elements
of the learner's vocabulary is chosen and their positions are interchanged. The new
number of successes is then calculated. If $m_{new}$ is greater than the previous $m_{old}$, the
flip is accepted with probability one. Otherwise it is accepted with probability

$$p = \exp(\beta \Delta m) \qquad (21)$$

where $\Delta m = m_{new} - m_{old}$, as is typically done in statistical mechanics simulations
with Metropolis dynamics. We continue this iterative process until the system reaches
an equilibrium state, and we calculate $\langle m \rangle$ and $\mathrm{Var}(m)$ by averaging our results over 500
initial system realizations.
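
The algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the figures: it averages over time within a single run after a burn-in period instead of over 500 independent realizations, and the function name `simulate`, the sweep counts and the seeds are hypothetical choices.

```python
import math
import random

def simulate(n, beta, sweeps=2000, burn_in=200, seed=0):
    """Metropolis estimate of <m> and Var(m) for a lexicon of size n."""
    rng = random.Random(seed)
    target = list(range(n))
    rng.shuffle(target)                 # the "optimal" language to be learned
    learner = list(range(n))
    rng.shuffle(learner)                # the learner's initial language
    m = sum(t == s for t, s in zip(target, learner))
    samples = []
    for sweep in range(sweeps):
        for _ in range(n):              # one sweep = n attempted swaps
            i, j = rng.sample(range(n), 2)
            before = (learner[i] == target[i]) + (learner[j] == target[j])
            after = (learner[j] == target[i]) + (learner[i] == target[j])
            delta = after - before      # m_new - m_old for this swap
            # Eq. (21): accept with probability min(1, exp(beta * delta))
            if delta >= 0 or rng.random() < math.exp(beta * delta):
                learner[i], learner[j] = learner[j], learner[i]
                m += delta
        if sweep >= burn_in:            # record one sample per sweep after burn-in
            samples.append(m)
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, var

mean, var = simulate(n=30, beta=1.0)
print(mean, var, math.exp(1.0))         # estimates vs the small-beta prediction
```

Only the two swapped positions are re-examined at each step, so the cost of a move does not grow with the lexicon size.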

3. Results and Discussion 

In Fig. 1 we plot the mean number of words $\langle m \rangle$ after the system has reached the
equilibrium state, divided by the language size $N$, for several system sizes, namely
for $N = 10, 300, 700, 1000$. The analytic results (solid lines) and the Monte Carlo
simulations (points) are in excellent agreement. We observe that for small $\beta$ the
fraction $\langle m \rangle / N$ tends to zero, while for large $\beta$ it becomes equal to one, and that there
exists a crossover between the two states at a crossover value $\beta_{cr}$ which depends on
$N$. In fact, it seems that $\beta_{cr}$ increases monotonically with increasing $N$ and that for
a given $\beta$ there is always a language size $N(\beta)$ such that for smaller $\beta$ the learning fraction
of that language is in the "zero" state and for larger $\beta$ it is in the "one" state. This aspect is
characteristic of a crossover phenomenon, in contrast to a phase transition, where there is a critical value of
a control parameter which does not depend on system size in such a manner and which
remains finite even for infinitely large system sizes.

In order to check the validity of our approximation for small language sizes we
present, in Fig. 2, a log-linear plot of the mean number of words $\langle m \rangle$ in equilibrium
as a function of $\beta$ for system sizes $N = 10, 50, 100, 300, 700, 1000$. We observe that for
small $\beta$ the points lie on straight lines, indicating an exponential dependence of $\langle m \rangle$ on
$\beta$. Moreover, the data collapse indicates the independence of $\langle m \rangle$ from the language size
$N$, in complete agreement with our analytical prediction, Eq. (14).

Next, we study the variance $\mathrm{Var}(m)$ of the vocabulary size $m$. Figure 3 shows
the number of words $\langle m \rangle$ and variance $\mathrm{Var}(m)$ vs $\beta$ for language size $N = 1000$. The
triangles are Monte Carlo simulation results and the solid line is equal to $\exp(\beta)$. The
collapse of the points indicates that up to a characteristic crossover value $\beta_{cr}$ both $\langle m \rangle$
and $\mathrm{Var}(m)$ increase with increasing $\beta$ and are both equal to $\exp(\beta)$, in agreement
with our analytical prediction. Above $\beta_{cr}$ the equilibrium vocabulary size assumes with
high probability its maximum value, thus there is a decrease in the fluctuations of $m$,
while a sharp maximum of $\mathrm{Var}(m)$ is observed at $\beta_{cr}$.

Figure 1. Mean number of words (normalized), $\langle m \rangle / N$, vs parameter $\beta$ for $N =
10, 300, 700, 1000$ (black, blue, red, green). Symbols are results of Monte Carlo
simulation and solid lines are numerical solutions of Eq.

Figure 2. Log-linear plot of the mean number of words $\langle m \rangle$ vs $\beta$ for $N =
10, 50, 100, 300, 700, 1000$. Notice the data collapse for small $\beta$ values, indicative of
an exponential scaling independent of $N$.

Figure 3. Mean number of words $\langle m \rangle$ and variance $\mathrm{Var}(m)$ vs $\beta$ for language size
$N = 1000$. Triangles are Monte Carlo simulation results for $\langle m \rangle$ (white) and $\mathrm{Var}(m)$
(black) and the solid line is equal to $\exp(\beta)$.

Finally, we examine how the crossover value $\beta_{cr}$ scales with the language size
$N$. We determine $\beta_{cr}$ from the position of the maximum of $\mathrm{Var}(m)$. Figure 4 shows
that $\beta_{cr} \simeq \ln N$, in quite good agreement with the analytical prediction. The physical
significance of $\beta_{cr}$ is that it determines the minimum linguistic ability that is required
by an individual for efficiently learning a language of size $N$. This scaling implies that
a small increase of the ability parameter $\beta$ will have a profound impact on language
learning, as it may lead from a "zero" state for the effective vocabulary (below $\beta_{cr}$) to
the "one" state of successful learning (above $\beta_{cr}$).
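
The location of the variance maximum can also be explored without simulation, by evaluating $\mathrm{Var}(m)$ exactly from the weights $g(m) \exp(\beta m)$ on a grid of $\beta$ values. The sketch below does this for a few moderate lexicon sizes; the grid step, the scanned range and the function names are assumptions made for illustration, and exact enumeration replaces the Monte Carlo estimate described in the text.

```python
from math import comb, exp, log

def log_multiplicities(n):
    """ln g(m) for all m with g(m) > 0, using g(m) = C(n, m) * D(n - m)."""
    d = [1]                                   # D(0) = 1
    for k in range(1, n + 1):
        d.append(k * d[-1] + (-1) ** k)       # D(k) = k*D(k-1) + (-1)^k
    return {m: log(comb(n, m) * d[n - m]) for m in range(n + 1) if d[n - m] > 0}

def variance(log_g, beta):
    """Var(m) under weights g(m) * exp(beta * m), computed in log-space."""
    logs = {m: lg + beta * m for m, lg in log_g.items()}
    top = max(logs.values())
    w = {m: exp(v - top) for m, v in logs.items()}
    norm = sum(w.values())
    mean = sum(m * x for m, x in w.items()) / norm
    return sum((m - mean) ** 2 * x for m, x in w.items()) / norm

def crossover(n, step=0.01):
    """beta at which Var(m) peaks, scanned on a grid from 0 to ln(n) + 2."""
    log_g = log_multiplicities(n)
    grid = [i * step for i in range(int((log(n) + 2) / step) + 1)]
    return max(grid, key=lambda b: variance(log_g, b))

peaks = {n: crossover(n) for n in (100, 200, 400)}
for n, b in peaks.items():
    print(n, round(b, 2), round(log(n), 2))   # peak position next to ln N
```

For finite $N$ the peak need not coincide exactly with $\ln N$; the comparison is meant only to show the logarithmic growth of the crossover with lexicon size.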
Our model allows a comparative analysis of animal communication.
Following Chomsky [H E], a strong connection between biology and linguistics has
been promoted, with genetically determined rules controlling the linguistic ability. The
parameter $\beta$ represents in an effective way this genetically determined linguistic ability,
and different species have different values. A given species, characterized by linguistic ability
$\beta$, may acquire and use a language consisting of up to $N$ words, where

$$N \simeq \exp(\beta) \qquad (22)$$

Notice that a small increase in $\beta$, the biological ability for language acquisition, induces
an exponential growth of the size of the available lexicon. Trained apes can learn 50-200
words, the best-known case being the bonobo named Kanzi [51]. This
size of the lexicon is reproduced by a $\beta$ of about 4. Songbirds display a richer lexicon
of about 700 words [12], corresponding to a value of 6 for $\beta$. An average high-school
graduate has a lexicon of about 60000 words [6], giving a $\beta$ value close to 11 for the
human species. We observe that a relatively small range in the genetically determined
$\beta$ parameter gives rise to immense variations in the size of the employed lexicon (see
fig. 4). Thus we have an approximate scheme, which can accommodate the biological
affinity of some species and their linguistic disparity.

Figure 4. Crossover value $\beta_{cr}$ versus the language size $N$. Squares are estimates of
the crossover from the maximum of $\mathrm{Var}(m)$ and the solid line is equal to $\ln N$.
Human average vocabulary is about 60000 words, while birds use roughly 1000 sounds
and apes understand even less than that.
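
The $\beta$ values quoted for the different species follow from inverting Eq. (22), $\beta \simeq \ln N$. The short sketch below carries out this arithmetic for the lexicon sizes mentioned in the text; the dictionary keys are only descriptive labels.

```python
from math import log

# Lexicon sizes quoted in the text
lexicon_sizes = {
    "trained apes (lower bound)": 50,
    "trained apes (upper bound)": 200,
    "songbirds": 700,
    "high-school graduate": 60000,
}

# Eq. (22) inverted: beta = ln N
beta_required = {name: log(size) for name, size in lexicon_sizes.items()}
for name, beta in beta_required.items():
    print(f"{name}: beta = {beta:.1f}")
```

The results, roughly 3.9 to 5.3 for apes, 6.6 for songbirds and 11.0 for humans, are close to the rounded figures given above.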

4. Conclusions and future directions 

We are dealing with language, and it is not appropriate to consider it as an isolated
system. Rather, we hope to capture aspects of the complex linguistic phenomenon by
resorting to a highly interdisciplinary method. In our paper we suggested that models
and techniques developed within physics might be useful in deciphering the language
riddle. The rationale behind the indicated course is that, since language is strongly tied
to cognition, we expect linguistic structures to reflect structures and patterns that we
encounter in nature and that are analyzed by physics. This profound interrelationship between
nature and human language is a permanent and continuous one and lies at the very foundation
of the "intelligibility" of the universe. As a first step we considered the simplest
language, a protolanguage, which is essentially a mapping between sounds and objects.
This mapping is represented by a matrix and the language interaction is simulated by
random matrix mechanics. The suggested interaction Hamiltonian between the matrices
is (see Eq. (2))

$$E(L, L') = -\frac{1}{2} \sum_{i,j} \left( p_{ij} q'_{ji} + p'_{ij} q_{ji} \right) \qquad (23)$$

Our simple model bears great resemblance to a well-known and extensively studied
problem in physics, the magnet-magnet interaction. A magnet may have one direction in
space, chosen among a given set of possible directions. When many magnets are brought
together, the interaction among them is expected to lead the magnets to
acquire a common direction in space, rather than each magnet having its own direction.
This common field is described as a mean field, and an individual field (a magnet, or
a particle) interacts with this average mean field. A particular matrix version of the
mean field technique may be found in ref. [43], and our model Hamiltonian is very
similar to theirs. In a similar vein, a protolanguage appears as a specific choice among a
huge number ($N!$) of possibilities. Social interaction among the different partners, each
using its own protolanguage, will eventually lead to the adoption of a unique collective
"mean protolanguage", $L(P, Q)$ in our case. It is with this "mean protolanguage"
that an individual will interact, the interaction being described by Eq. (23). Random
matrices have been widely used in nuclear and particle physics and in general in systems
involving large numbers of degrees of freedom [44], [45]. Matrix models are directly linked
to string theory [46], the theory unifying all interactions in nature [47]. It has also been
shown recently that relational logic and category theory are expressed by matrix models
[38]. Thus, our proposal opens the possibility for a fruitful interaction between linguistics
and advanced sectors of theoretical and mathematical physics.

We adopted Chomsky's vision that language acquisition is rooted in innate
structures and that innateness comes in degrees. This linguistic innateness is represented in
our model by the effective parameter $\beta$, having different values for the different biological
species. We can only advance hypotheses about what lies behind the dispersion of $\beta$
values, the innate propensity for language acquisition. It has been suggested that the
human brain, being relatively larger than that of other primates, runs a significantly
larger number of neural interconnections [49], leading to a high $\beta$ value for humans.
Along a different line, neurobiologists have identified the gene FOXP2 as directly
affecting the language ability in humans [2], [4], [50]. The presence and the specific
functioning of similar genes in other primates and in songbirds is of prime importance
[50], [51]. Bipedalism has also been considered as a factor favoring the development of
language. Upright posture sets the hands free for alternate uses [52] and provides a
frontal and wide view of the environment, thus increasing the stimulus for cognition
and symbolic expression.

Spoken languages leave no fossils, and consequently it is not easy to infer the
evolution of language. But as Simon Conway Morris argues, "it would be strange if my
fingers and eyes were to have an evolutionary origin but not my capacity to speak" [53].
Two evolutionary scenarios have been advanced: a gradual and mosaic one, where
language follows the pattern of most evolutionary events (like the long evolution of
the eye), and an abrupt one, where language emerges in a single-step process [2]. Our
work offers a further step on this intricate issue. The relationship between the innate
propensity for language ($\beta$) and the lexicon richness is a continuous one, as displayed
in fig. 1. One notices, though, the abrupt transition from poor linguistic achievement
($\beta < \beta_{cr}$) to high linguistic achievement ($\beta > \beta_{cr}$). A small increase of $\beta$ may lead
to an incredible evolutionary leap, which may be qualified as a "lexical big bang". In
that way we may interpret the apparent language discontinuity between humans and
the other hominoids.

More than 50 years of research using classical training studies demonstrates that
animals (apes, parrots, pigeons, rats) can acquire a number of words or concepts [6].
With regard to number quantification, animals can represent numbers up to a maximum
(around 9) [54]. As the target number increases, the standard deviation around the
matched mean increases accordingly. This spread around the mean value is reproduced
by our model, see Eqs. (14) and (15) for small $\beta$. On the other hand, humans are unique in
their ability to show an open-ended quantification skill, including discrete infinity among
the numbers. We again attribute this human capacity to a correspondingly large $\beta$ value.

There is a strong tendency to advocate a modular dissociation between lexicon and
grammar, between protolanguage and fully developed language. Bates and Goodman
have provided evidence that the emergence of grammar is highly dependent upon the
lexicon size [55]. Thus the degree of grammatical competence acquired by children is
strictly linked to the lexical stage at which they are. Children with lexicons under 300
words have very restricted grammatical abilities. Viewed in this light, chimpanzees,
with a lexicon of 200 words, appear to be arrested at a point in lexical development
when grammar is still at a very simple level [55]. This type of approach is corroborated
by the experimental finding that songbirds, possessors of a richer lexicon composed of
700 sounds, recognize acoustic patterns defined by a recursive, self-embedding, context-free
grammar [32]. Further along is the language of the human primate, with a much
larger lexicon and considerably richer grammar. A comparison reveals that while on
biological grounds we are close to the other primates, on linguistic grounds we are closer
to birds (the human as a singing ape was described by Darwin back in 1871 [19]). Fig. 4,
displaying the genetic propensity for language $\beta$ vs lexicon size $N$, may be viewed with
the coordinate $N$ representing also the grammatical complexity.

Our exploration of reality is always mediated by language or a general semiotic
process. Next to the real world, we create an entire world of symbols, organized
internally by the different forms of language. The symbolic world is substantiated by
individual cognitive units (neurons), joined and operated by largely unknown physical
mechanisms. Or as Noam Chomsky put it: "We know very little about what happens
when $10^{10}$ neurons are crammed into something of the size of a basketball, with further
conditions imposed by the specific manner in which this system developed over time" [8].
And later: "It may be that at some remote period a mutation took place that gave
rise to the property of discrete infinity, to be explained in terms of the property of
physical mechanisms, now unknown" [56]. Symbols and words are organized into finite
strings (sentences), following a finite number of grammatical rules, through the recursive
application of these rules. We consider that the grammatical parsing of languages bears
resemblance to the parsing of the natural processes occurring in the world. Both of
them may be simulated by random matrix dynamics, involving interaction terms more
complex than the one considered in the present paper (Eq. (23)). We hope that this type
of approach, incorporating ideas and models from physics into language research,
will prove fruitful and interesting in the future.

Appendix 

A derangement is a permutation in which none of the elements of the set appears in
its original position. Or, considered as a bijection $f : S \to S$, a derangement
does not allow an element $x \in S$ with $f(x) = x$. To find the number of derangements
of an $n$-element set $S$, the inclusion-exclusion principle is used. The set of
all permutations $P$ of the set $S$ has cardinality $|P| = n!$. To obtain the number of
derangements we have to subtract from the total number of permutations those which
map an element to itself. Let us call $A_i$ ($1 \le i \le n$) the set of all permutations that
map the $i$th element to itself. Then $\sum_i |A_i| = \binom{n}{1} (n-1)!$. This process leads to
underestimation, since the subtraction removes twice the permutations having two fixed
points. We should then add $\sum_{i<j} |A_i \cap A_j| = \binom{n}{2} (n-2)!$.

This time we reach an overestimation, since each permutation with three fixed points
has been subtracted three times and added back three times, so it is still counted once.
This type of analysis continues until we reach the $n$th term, and the number of
derangements emerges as a sum with alternating signs,

$$D(n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)! = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!}$$
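
The inclusion-exclusion count described in this appendix can be verified against direct enumeration for small sets. The sketch below is only a consistency check; both function names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial

def derangements_inclusion_exclusion(n):
    """Alternating sum over k of (-1)^k * C(n, k) * (n - k)!."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))

def derangements_brute_force(n):
    """Count permutations of range(n) with no fixed point by enumeration."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))

table = [(n, derangements_inclusion_exclusion(n)) for n in range(8)]
agree = all(d == derangements_brute_force(n) for n, d in table)
print(table, agree)
```

The first values are 1, 0, 1, 2, 9, 44, 265, 1854, in agreement with $D(0) = 1$ and $D(1) = 0$ quoted in Section 2.1.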